Experience replay is widely used in off-policy reinforcement learning. With cpprb, you can start experimenting quickly without having to implement a replay buffer yourself.
Computationally heavy operations are implemented in C++ and Cython, so cpprb is usually faster than a naive Python implementation.
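For reference, a naive pure-Python replay buffer might look like the sketch below. It is illustrative only: the class name `NaiveReplayBuffer` and its methods are hypothetical, it is not part of cpprb, and it is meant only to show the kind of baseline that the comparison above refers to.

```python
import random
from collections import deque


class NaiveReplayBuffer:
    """Hypothetical pure-Python baseline; not cpprb's implementation."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition once full.
        self.storage = deque(maxlen=capacity)

    def add(self, obs, act, rew, next_obs, done):
        # Store one transition as a plain tuple.
        self.storage.append((obs, act, rew, next_obs, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        batch = random.sample(list(self.storage), batch_size)
        # Transpose the list of transitions into one list per field.
        obs, act, rew, next_obs, done = map(list, zip(*batch))
        return {"obs": obs, "act": act, "rew": rew,
                "next_obs": next_obs, "done": done}

    def __len__(self):
        return len(self.storage)


buffer = NaiveReplayBuffer(capacity=3)
for step in range(5):
    buffer.add(obs=step, act=0, rew=1.0, next_obs=step + 1, done=False)

batch = buffer.sample(2)
```

Every call here goes through Python objects and per-sample list copies, which is the overhead a C++/Cython implementation can avoid.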
cpprb supports Ape-X on a single machine. You don’t need to manage error-prone locks yourself; cpprb internally locks only the critical sections.
cpprb supports a flexible environment specification. Any number of NumPy-compatible environment values can be stored.
Questions, requests, and other feedback are welcome.
TF2RL provides a set of reinforcement learning algorithms for TensorFlow 2. TF2RL uses cpprb for its off-policy algorithms.
You can find awesome repositories using cpprb. We’re looking forward to seeing your great work show up.